# A 260µW Infrared Gesture Recognition System-on-Chip for Smart Devices

Sechang Oh<sup>1</sup>, Ngoc Le Ba<sup>2</sup>, Suyoung Bang<sup>1</sup>, Junwon Jeong<sup>3</sup>, David Blaauw<sup>1</sup>, Tony T. Kim<sup>2</sup>, Dennis Sylvester<sup>1</sup>

<sup>1</sup>University of Michigan, MI; <sup>2</sup>Nanyang Technological University, Singapore; <sup>3</sup>Korea University, Korea

#### **Abstract**

This paper presents a low-power infrared motion detection system suitable for smart devices such as wearables. The SoC incorporates instrumentation chopper amplifiers (ICAs), LPFs, ADCs, and a DSP. The low-noise ICAs amplify very-low-frequency $\mu V$-level thermopile outputs with a noise efficiency factor (NEF) of 2.0 and provide programmable gain modes. To reduce standby power, the ICA uses lower current when the system is in idle mode. Wakeup can be triggered by detection of a simple gesture. For the LPF, source degeneration by pseudo-resistors and $g_m$ division techniques are used for both improved linearity and 30Hz bandwidth. The DSP employs a motion history image technique to achieve low-power detection. The system consumes $260\mu W$ in active mode and $46\mu W$ in idle mode while processing $16\times4$ infrared data at 30fps. A complete system demonstration is shown.

### Introduction

Recent demand for natural human-computer interfaces such as gesture recognition has increased, particularly for compact wearable devices. Cameras are currently the most common platform for gesture sensing, but they are highly sensitive to environmental light conditions. Extended-range capacitive sensing [1] and ultrasonic techniques [2] have been explored, but they consume significant energy due to their excitation source. In contrast, an infrared sensing system, in which a thermopile array directly converts incoming infrared radiation energy into electrical energy, is an appealing low-power choice since the sensor array itself is passive. It can be fabricated in CMOS technology and generates voltages linearly proportional to the temperature difference between an object and the background environment [3, 4]. However, array sensitivity is just a few µV/°C and its time constant is several ms. To achieve ultra-low-power gesture recognition, we propose an SoC including a low-noise instrumentation chopper amplifier for low-frequency signals, a low-power LPF for filtering out-of-band noise including the chopper frequency and its harmonics, an ADC, and a low-power DSP based on motion history images [5].

#### **Proposed Gesture Recognition SoC**

This paper targets a gesture sensing system using a thermopile array (Figs. 1 and 2). A hand emits infrared radiation with wavelength representing its temperature; this forms an image incident upon a 16×4 thermopile array. Each thermopile signal connects to an AFE path that consists of an ICA and LPF. The four row ADCs digitize the amplified/filtered signals using time-division multiplexing and the DSP then analyzes the waveform to detect gestures.

Fig. 3 shows the proposed ICA. Since the gesture signals are significantly impacted by 1/f noise, they are chopped to remove this 1/f noise and then sent through two amplifiers. Overall gain needs to be up to 80dB for a power-efficient high dynamic range system. C<sub>1</sub>/C<sub>2</sub> (15pF/150fF) and C<sub>3</sub>/C<sub>4</sub> (C<sub>4</sub>=20fF) set the gains for the Low-noise Amplifier (LNA) and Programmable-gain Amplifier (PGA), respectively. C<sub>3</sub> is programmable (200fF-3pF) for system flexibility. OTA1 and OTA2 are implemented with inverter-based cascode amplifiers to maximize g<sub>m</sub> and gain at a given current. The common-mode feedback (CMFB) amplifiers consume a fraction of the power using ratioed transistor sizes. As in typical noise-limited designs, the first amplifier stage consumes the majority of the total power (up to $2.5\mu A$ current) to achieve sub-$\mu V$ noise, while the PGA consumes just 90nA, constrained by the chopper bandwidth. Transistor sizes are chosen for optimal noise efficiency factor (NEF), and the chopper frequency is 1kHz. The ICA high-pass corner is set by (R<sub>3</sub>C<sub>5</sub>)<sup>-1</sup> in the DC servo loop. The R<sub>1</sub> and R<sub>2</sub> paths set the input common-mode voltages and cancel the offsets. Fast-settling switches (FS<sub>1-3</sub>) are selectively turned on to reduce settling time when ICA settings are changed, decreasing the corresponding resistance by 100× in simulation.
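As a quick check of the gain budget, the sketch below evaluates the capacitor ratios quoted above. It assumes each stage's closed-loop gain equals its ideal capacitor ratio (no parasitics, infinite OTA gain); the function name `ica_gain_db` is illustrative and not part of the design.

```python
import math

# Capacitor values quoted in the text.
C1 = 15e-12       # LNA input capacitor (F)
C2 = 150e-15      # LNA feedback capacitor (F)
C4 = 20e-15       # PGA feedback capacitor (F)
C3_MIN = 200e-15  # lowest programmable PGA input capacitor (F)
C3_MAX = 3e-12    # highest programmable PGA input capacitor (F)


def ica_gain_db(c3):
    """Ideal two-stage ICA gain in dB for a given C3 setting.

    Assumption: each stage's gain is exactly its capacitor ratio.
    """
    lna_gain = C1 / C2
    pga_gain = c3 / C4
    return 20 * math.log10(lna_gain * pga_gain)


if __name__ == "__main__":
    print(f"LNA gain:               {20 * math.log10(C1 / C2):.1f} dB")
    print(f"Total gain, C3 = 200fF: {ica_gain_db(C3_MIN):.1f} dB")
    print(f"Total gain, C3 = 3pF:   {ica_gain_db(C3_MAX):.1f} dB")
```

Under these idealized assumptions the LNA contributes 40dB and the programmable range spans roughly 60dB to 83.5dB, which is consistent with the stated need for up to 80dB of overall gain.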

The ICA outputs show ripple at the 1kHz chopping frequency and its harmonics. These are removed with the proposed Gm-C LPF in Fig. 4. The two biquads are connected in series to form a 4<sup>th</sup> order filter. Since the gesture information resides in a low frequency range, the LPF bandwidth is set to 30Hz to achieve high SNR. C<sub>LPF</sub> is a capacitor array and is set to 8.9pF to approximately match AFE and thermopile pixel size. Considering $f_{LPF3dB}=g_m/(2\pi C_{LPF})$, $g_m$ in the nS range is required. To achieve this, bias current must be extremely low, leading to potentially poor linearity. Thus, source degeneration and g<sub>m</sub> division techniques [6] are used in the LPF. The Gm-stage input current is divided by the series-parallel current mirror to effectively obtain g<sub>m</sub>/32. To enhance linearity, input pair sources are degenerated by pseudo-resistors whose gates are controlled by inputs. Simulation results show the resulting g<sub>m</sub> is linear within ±100mV input range (defined by full width at half-maximum). The CMFB amplifier replicates voltages in the main Gm stage and sets the common mode output voltage. LPF outputs in each row are time-multiplexed via a 16:1 analog multiplexer and connected to a differential 8b SAR ADC (Fig. 5). The ADC sampling rate is 1kS/s.
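The numbers in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below applies the stated relation $f_{LPF3dB}=g_m/(2\pi C_{LPF})$ and, as an assumption not stated in the text, treats the 16:1 multiplexer as cycling through its channels in uniform round-robin order; the helper names are hypothetical.

```python
import math

F_3DB_HZ = 30.0        # LPF bandwidth from the text
C_LPF_F = 8.9e-12      # LPF capacitor from the text
GM_DIVISION = 32       # series-parallel current mirror division from the text
ADC_RATE_SPS = 1000.0  # per-row ADC sampling rate from the text
MUX_CHANNELS = 16      # 16:1 analog multiplexer per row


def required_gm(f_3db_hz, c_farads):
    """Effective transconductance (S) from f_3dB = gm / (2*pi*C)."""
    return 2 * math.pi * f_3db_hz * c_farads


def per_pixel_sample_rate(adc_rate_sps, channels):
    """Samples per second seen by each pixel, assuming round-robin muxing."""
    return adc_rate_sps / channels


if __name__ == "__main__":
    gm_eff = required_gm(F_3DB_HZ, C_LPF_F)
    print(f"Effective gm:            {gm_eff * 1e9:.2f} nS")
    print(f"gm before /32 division:  {gm_eff * GM_DIVISION * 1e9:.1f} nS")
    print(f"Per-pixel sampling rate: "
          f"{per_pixel_sample_rate(ADC_RATE_SPS, MUX_CHANNELS):.1f} S/s")
```

With these inputs the effective g<sub>m</sub> comes out near 1.7nS, which matches the "nS range" statement, and each pixel is sampled at 62.5S/s if the multiplexer visits all 16 channels equally.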

Fig. 6 describes the overall structure of the proposed motion recognition DSP. There are three separate memories to store frame data. The first memory contains the motion history image (MHI), which is the difference between the current and previous frames (Fig. 7). The second and third memories are used to store two consecutive frames once motion is detected. Detection modules use data from the three memories to analyze the gesture. Fig. 8 shows the top-level design for the proposed gesture detection algorithm. Motion is detected by counting the number of pixels having a significant change in value (i.e., ADC output code) between the current and previous frame. If there is no motion for a period of time, the processor goes into an idle mode with only a simpler motion-detecting circuit enabled to save power.
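The motion-detection step described above can be sketched in software as follows. This is a minimal illustration, not the DSP's actual implementation: the class name `MotionGate`, the two thresholds, and the idle timeout are all hypothetical values chosen for the example, and frames are modelled as lists of rows of ADC codes.

```python
def count_changed_pixels(prev, curr, pixel_threshold):
    """Count pixels whose ADC code changed by more than pixel_threshold."""
    return sum(
        1
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
        if abs(c - p) > pixel_threshold
    )


class MotionGate:
    """Flags motion per frame and enters idle after a run of still frames.

    All default parameter values are assumptions made for illustration.
    """

    def __init__(self, pixel_threshold=8, count_threshold=4,
                 idle_after_frames=90):
        self.pixel_threshold = pixel_threshold
        self.count_threshold = count_threshold
        self.idle_after_frames = idle_after_frames
        self.still_frames = 0
        self.idle = False
        self.prev = None

    def update(self, frame):
        """Process one frame; return True if motion was detected."""
        if self.prev is None:  # first frame: nothing to compare against
            self.prev = frame
            return False
        changed = count_changed_pixels(self.prev, frame, self.pixel_threshold)
        self.prev = frame
        motion = changed >= self.count_threshold
        if motion:
            self.still_frames = 0
            self.idle = False  # motion wakes the gate
        else:
            self.still_frames += 1
            if self.still_frames >= self.idle_after_frames:
                self.idle = True
        return motion
```

A caller would feed one 4-row by 16-column frame per `update` call at the frame rate and run the heavier gesture analysis only when `update` returns `True`.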

When motion is detected, a sweeping algorithm uses two motion history image frames to analyze the motion. In this process each row and column of the MHI frames are first summed. The type of movement (diagonal, up-down, or left-right) is then discerned based on the number of peaks found in the row- and column-wise sums. In a diagonal sweep both row and column sums will exhibit clear peaks (i.e., four total peaks detected) whereas in up-down or left-right sweeps only two peaks are observed due to constant behavior in either the horizontal or vertical direction. This is shown in Fig. 7, which illustrates the principle of detection for sweeping gestures. Up-down or left-right direction can be determined based on the relative positions of negative to positive peaks, as seen in Fig. 7. This approach allows the DSP to accurately identify specific gestures.
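The peak-counting idea can be illustrated with a short sketch. It simplifies the description above by using a single difference image computed from two frames, and it assumes that the hand is warmer than the background, that the difference is current minus previous, and that row and column indices increase downward and rightward. The function names and the threshold are hypothetical.

```python
def motion_history_image(prev, curr):
    """Signed per-pixel difference between the current and previous frame."""
    return [[c - p for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def signed_peaks(profile, threshold):
    """Return (positive_peak_index, negative_peak_index), or None.

    A profile counts as having peaks only if it contains both a value
    at or above +threshold and a value at or below -threshold.
    """
    pos = max(range(len(profile)), key=lambda i: profile[i])
    neg = min(range(len(profile)), key=lambda i: profile[i])
    if profile[pos] >= threshold and profile[neg] <= -threshold:
        return pos, neg
    return None


def classify_sweep(prev, curr, threshold=20):
    """Classify a sweep from row-wise and column-wise sums of the MHI."""
    mhi = motion_history_image(prev, curr)
    row_sums = [sum(row) for row in mhi]
    col_sums = [sum(col) for col in zip(*mhi)]
    row_peaks = signed_peaks(row_sums, threshold)
    col_peaks = signed_peaks(col_sums, threshold)
    if row_peaks and col_peaks:  # four peaks in total
        return "diagonal"
    if col_peaks:  # peaks only along the horizontal axis
        pos, neg = col_peaks
        return "right" if pos > neg else "left"
    if row_peaks:  # peaks only along the vertical axis
        pos, neg = row_peaks
        return "down" if pos > neg else "up"
    return "none"
```

In this sketch the positive peak marks where the warm object arrived and the negative peak marks where it left, so their relative positions give the sweep direction.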

## **Measured Results**

The proposed gesture recognition SoC is implemented in 65nm CMOS. The ICA input noise density is $31\text{nV}/\sqrt{\text{Hz}}$ in active mode and $130\text{nV}/\sqrt{\text{Hz}}$ in idle mode (Fig. 9), and chopping successfully suppresses 1/f noise. LPF bandwidth is adjustable between 10Hz and 150Hz by changing C<sub>LPF</sub> (Fig. 10). Fig. 11 shows the LPF noise spectrum; HD3 is 48.9dB at an input of 0.1V<sub>pp</sub>. Fig. 12 shows ADC performance. The system is demonstrated with an external $16\times4$ thermopile and lens, and Fig. 13 shows detection of a hand sweeping across the field of view. Fig. 15 summarizes measured results and compares with recent works. This work represents the first SoC for gesture sensing applications using a thermopile array. Its size ($8.1\text{mm}^2$, Fig. 14) and power ($260\mu\text{W}$ active and $46\mu\text{W}$ idle) are suitable for emerging smart devices.
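To relate the measured noise densities to the µV-level thermopile signals, the sketch below estimates the integrated input-referred noise. It assumes the noise is white across the band and approximates the LPF as a brick-wall filter at 30Hz; neither assumption comes from the measurements, so the results are rough estimates only.

```python
import math

LPF_BANDWIDTH_HZ = 30.0   # nominal LPF bandwidth from the text
ACTIVE_DENSITY = 31e-9    # V/sqrt(Hz), measured in active mode
IDLE_DENSITY = 130e-9     # V/sqrt(Hz), measured in idle mode


def integrated_noise_vrms(density_v_per_rthz, bandwidth_hz):
    """RMS noise of a white spectrum over an ideal brick-wall bandwidth."""
    return density_v_per_rthz * math.sqrt(bandwidth_hz)


if __name__ == "__main__":
    for label, density in (("active", ACTIVE_DENSITY), ("idle", IDLE_DENSITY)):
        noise = integrated_noise_vrms(density, LPF_BANDWIDTH_HZ)
        print(f"{label:>6} mode: {noise * 1e9:.0f} nVrms")
```

Under these assumptions the estimate is about 170nV<sub>rms</sub> in active mode and about 710nV<sub>rms</sub> in idle mode, both below 1µV.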

## References

- [1] Y. Hu, ISSCC 2014.
- [2] R. Przybyla, ISSCC 2011.
- [3] M. Hirota, SPIE 2003.
- [4] H. Kawanishi, ISSCC 2008.
- [5] C. Hsieh, ICSPS 2010.
- [6] A. Arnaud, JSSC 2006.
- [7] Q. Fan, JSSC 2011.
- [8] Y.-P. Chen, VLSIC 2014.
- [9] P. Bruschi, JSSC 2007.
- [10] S.-Y. Lee, TBCAS 2009.



Fig. 15. Performance summary and comparison with prior works.